Abstract

Dualization of a monotone Boolean function on a finite lattice can be represented 
by transforming the set of its minimal 1 values to the set of its maximal 0 values. 
In this paper we consider finite lattices given by ordered sets of their meet and join 
irreducibles (i.e., as a concept lattice of a formal context). We show that in this 
case dualization is equivalent to the enumeration of so-called minimal hypotheses. In 
contrast to the usual dualization setting, where a lattice is given by the ordered set of
its elements, dualization in this case is shown to be impossible in output polynomial
time unless P = NP. However, if the lattice is distributive, dualization is shown to be
possible in subexponential time.


1 Introduction 

A monotone Boolean function on a finite lattice can be given by the set of minimal 1 values 
or by the set of its maximal 0 values. Dualization is the transformation of the set of minimal 
1 values of a Boolean function to the set of its maximal 0 values or vice versa. Since 
dualization is equivalent to many important problems in computer and data sciences Him 
[5], the paper [10] on a quasi-polynomial dualization algorithm for Boolean lattices was an
important breakthrough. It paved the way to generalizations to various classes of structures 
where dualization in output subexponential time is possible, among them dualization on 
lattices given by ordered sets of their elements or by products of bounded width lattices, like 
chains mu- 

A well-known fact is that every lattice is determined up to isomorphism by the ordered
set of its meet (infimum) and join (supremum) irreducible elements [H]. These elements
cannot be represented as meets (joins) of other elements that are larger (smaller) than them.
In the diagram of a finite lattice these elements have exactly one upper (lower) neighbor. In this paper
we consider finite lattices given by ordered sets of their meet and join irreducibles, known
as concept lattices mmiiiii- We show that dualization for representation of this type is 
impossible in output polynomial time unless P = NP. However, in an important particular 
case where the lattice is distributive, we propose a subexponential algorithm. 

Dualization in the considered case is not only of theoretical interest. Actually, this study
was motivated by the practical problem of enumerating minimal hypotheses, which is a problem
of learning a specific type of classifier from positive and negative examples. Hypotheses, or
JSM-hypotheses, were proposed by V. K. Finn [BlIH] and formalized in terms of Formal Concept
Analysis (FCA) in [1^1 HSl E]. The set of minimal hypotheses is classification equivalent
to the set of all hypotheses, thus giving a condensed representation of the latter. The
set of all hypotheses can be generated with polynomial delay HZ]; however, the problem
of generating minimal hypotheses with polynomial delay remained open for a long
time. In this paper we show that dualization on lattices given by the ordered set of their
irreducible elements is equivalent to the enumeration of minimal hypotheses, so complexity
results concerning minimal hypotheses and dualization can be mutually translated.

In what follows we shall use the notation of Formal Concept Analysis m , which provides 
a convenient language and necessary results for lattices given by ordered sets of irreducible 
elements. 

The rest of the paper is organized as follows. In the second section we give the most important
definitions. In the third section we prove the main intractability result on the impossibility of
enumerating minimal hypotheses and dualization in output polynomial time unless P =
NP. In the fourth section we discuss the implications of the results for the
problem of dualizing monotone Boolean functions. In the fifth section we relate the minimum
implication base problem to dualization over a product of lattices that are given explicitly,
and to dualization over a distributive lattice. In the sixth section we describe a subexponential
dualization algorithm for the distributive lattice case.

1.1 Related work 

To the best of our knowledge, all dualization problems studied in previous
works consider dualization over a product of posets $\mathcal{P} = \mathcal{P}_1 \times \dots \times \mathcal{P}_k$, where each poset $\mathcal{P}_i$ is
some special type of poset that is given explicitly. In [5][7] the author gives quasi-polynomial
time algorithms for the following cases: each $\mathcal{P}_i$ is a join semi-lattice of bounded width (any
antichain has constant size), each $\mathcal{P}_i$ is a forest poset in which either the in-degree or the
out-degree of each element is constant (see also 0), each $\mathcal{P}_i$ is the lattice of intervals defined
by a set of intervals on the real line $\mathbb{R}$. In [5] a more general dualization problem was
stated, where each $\mathcal{P}_i$ is a lattice (with no bounds on its width); the existence of quasi-polynomial
time algorithms for this case is still an open question. In this paper we prove an
upper bound on the complexity of the latter problem in terms of another long-standing open complexity
problem, the minimum implication base problem (see [2], equivalently the SID problem from [2T1I3]). The
most common techniques leading to quasi-polynomial time algorithms for duality problems are
based on the idea of high-frequency-based decomposition, first introduced in [10]. We use
this method to obtain a subexponential algorithm for dualization over a distributive lattice.

Although a product of lattices $\mathcal{L} = \mathcal{L}_1 \times \dots \times \mathcal{L}_k$, where each $\mathcal{L}_i$ is given explicitly, can
provide an exponentially smaller description of $\mathcal{L}$, not every lattice has a nontrivial
exponentially smaller representation of this kind.

2 Preliminaries


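Definition 2.1.¹ A subset $\mathcal{A} \subseteq \mathcal{P}$ of a partially ordered set $(\mathcal{P}, \le)$ is called an antichain
iff $A \not\le B$ for any distinct $A, B \in \mathcal{A}$, i.e., all elements of an antichain are incomparable.

The following property is required in dualization problems. For two antichains $\mathcal{A}, \mathcal{B} \subseteq \mathcal{P}$
we say that $(\mathcal{A}, \mathcal{B})$ has property (*) if

$A \not\le B$ for any $A \in \mathcal{A}$, $B \in \mathcal{B}$. (*)

Definition 2.2. Antichains $\mathcal{A}, \mathcal{B} \subseteq \mathcal{P}$ of a partially ordered set $\mathcal{P}$ are called dual iff $\mathcal{A}, \mathcal{B}$
satisfy property (*) and for any $P \in \mathcal{P}$ either $P \le B$ for some $B \in \mathcal{B}$ or $A \le P$ for some
$A \in \mathcal{A}$.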


The dualization problem over a partially ordered set usually has the following statement:
Problem: Dualization over partially ordered set $\mathcal{P}$

INPUT: Partially ordered set $\mathcal{P}$ (that can be given implicitly), antichain $\mathcal{A} \subseteq \mathcal{P}$.

OUTPUT: Antichain $\mathcal{B} \subseteq \mathcal{P}$ such that $\mathcal{A}$ and $\mathcal{B}$ are dual.

Note that the output $\mathcal{B}$ of the dualization problem can be exponential in the input
size ($|\mathcal{A}| \times |[\text{description of } \mathcal{P}]|$). Therefore, we are interested in the time complexity of
dualization as a function of both input and output sizes. We say that the dualization problem
can be solved in output polynomial time if there is an algorithm that generates the set $\mathcal{B}$ in
time polynomial in $|\mathcal{B}| \times |\mathcal{A}| \times |[\text{description of } \mathcal{P}]|$. Usually we will consider the decision version
of the dualization problem, called the duality problem:

Problem: Duality over partially ordered set $\mathcal{P}$

INPUT: Partially ordered set $\mathcal{P}$ (that can be given implicitly), antichains $\mathcal{A}, \mathcal{B} \subseteq \mathcal{P}$ satisfying (*).

QUESTION: Are antichains $\mathcal{A}$ and $\mathcal{B}$ dual?

An equivalent definition of dualization over a poset can be given using monotone Boolean²
functions on a partially ordered set. Let $f : \mathcal{P} \to \{0, 1\}$ be a monotone Boolean function on
a partially ordered set $\mathcal{P}$, i.e., $X \le Y \Rightarrow f(X) \le f(Y)$, and let $\mathcal{A}$ be the set of minimal 1-values of
$f$. Clearly, the set of maximal 0-values of $f$ is dual to $\mathcal{A}$.
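
The duality condition of Definition 2.2 can be checked by brute force when the poset is listed explicitly. The following is only an illustrative sketch, not an algorithm from this paper: the function names `leq` and `are_dual` and the example function $f = (x_1 \wedge x_2) \vee x_3$ on the Boolean cube are assumptions made for the illustration.

```python
from itertools import product

def leq(x, y):
    """Componentwise order on 0/1 tuples, i.e. the Boolean lattice {0,1}^n."""
    return all(a <= b for a, b in zip(x, y))

def are_dual(poset, leq, A, B):
    """Brute-force test of Definition 2.2 on an explicitly listed poset."""
    # Property (*): no element of A lies below an element of B.
    if any(leq(a, b) for a in A for b in B):
        return False
    # Every element is below some B in the second antichain or above some A in the first.
    return all(any(leq(p, b) for b in B) or any(leq(a, p) for a in A)
               for p in poset)

cube = list(product((0, 1), repeat=3))
A = [(1, 1, 0), (0, 0, 1)]   # minimal 1-values of f = (x1 and x2) or x3
B = [(1, 0, 0), (0, 1, 0)]   # maximal 0-values of the same f

print(are_dual(cube, leq, A, B))
print(are_dual(cube, leq, A, B[:1]))
```

The check runs in time proportional to the size of the poset, which is exactly what is unavailable when the lattice is given only by its irreducible elements.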

In this paper we consider only the case where the partially ordered set over which we
dualize is a lattice. A partially ordered set $(\mathcal{L}, \le)$ is called a lattice [1] if any pair of its elements
has an infimum (meet $\wedge$) and a supremum (join $\vee$). Equivalently, a lattice is an algebra
$(\mathcal{L}, \wedge, \vee)$ with the following properties of $\wedge$ and $\vee$:


L1 $X \vee X = X$, $X \wedge X = X$ (idempotence)

L2 $X \vee Y = Y \vee X$, $X \wedge Y = Y \wedge X$ (commutativity)

L3 $X \vee (Y \vee Z) = (X \vee Y) \vee Z$, $X \wedge (Y \wedge Z) = (X \wedge Y) \wedge Z$ (associativity)

L4 $X = X \wedge (X \vee Y) = X \vee (X \wedge Y)$ (absorption)

¹ We use capital characters to denote elements of partially ordered sets since it agrees with FCA notation
for concept lattices.

² Hereafter by Boolean functions we mean Boolean-valued functions.

A lattice is called complete if every subset of it has an infimum and a supremum.

A lattice is distributive if for any $X, Y, Z \in \mathcal{L}$

$X \wedge (Y \vee Z) = (X \wedge Y) \vee (X \wedge Z)$.

The following elements of a lattice are very important in our work. An element $X \in \mathcal{L}$ is
called infimum-irreducible (or meet-irreducible) if $X \ne \bigwedge_{Y > X} Y$, i.e., $X$ is not represented
by the intersection of any elements above it. Dually, an element $X \in \mathcal{L}$ is called supremum-irreducible
(or join-irreducible) if $X \ne \bigvee_{Y < X} Y$, i.e., $X$ is not represented by the union of any
elements below it. Meet- (join-) irreducible elements have only one upper (lower) neighbor
in the lattice diagram.

In what follows we use the standard definitions and facts of Formal Concept Analysis
(FCA) from [TT]. Let $G$ and $M$ be sets, called the set of objects and the set of attributes, respectively.
Let $I$ be a relation $I \subseteq G \times M$ between objects and attributes: for $g \in G$, $m \in M$, $gIm$ holds
iff the object $g$ has the attribute $m$. The triple $\mathbb{K} = (G, M, I)$ is called a (formal) context
and is naturally represented by a cross-table, where rows stand for objects, columns stand for
attributes and crosses stand for pairs $(g, m) \in I$. If $A \subseteq G$, $B \subseteq M$ are arbitrary subsets,
then the following derivation operators

$A' = \{m \in M \mid gIm \ \forall g \in A\}$

$B' = \{g \in G \mid gIm \ \forall m \in B\}$

define a Galois connection between the ordered powersets $(2^G, \subseteq)$ and $(2^M, \subseteq)$, since $A \subseteq B' \Leftrightarrow B \subseteq A'$.
A pair $(A, B)$, where $A \subseteq G$, $B \subseteq M$, $A' = B$, and $B' = A$, is called a
(formal) concept (of the context $\mathbb{K}$) with extent $A$ and intent $B$ (in this case we also have
$A'' = A$ and $B'' = B$). Formal concepts are ordered by the following relation

$(A_1, B_1) \le (A_2, B_2)$ iff $A_1 \subseteq A_2$ ($B_2 \subseteq B_1$),

this partial order being a complete lattice on the set of all concepts. This lattice is called the
concept lattice $\mathcal{L}(G, M, I)$ of the context $(G, M, I)$.
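
The derivation operators and the set of concepts can be made concrete on a toy context. The sketch below is a naive enumeration for illustration only; the context, the attribute names and the helper names `up`, `down` and `concepts` are hypothetical and not part of the paper.

```python
from itertools import combinations

def up(objs, ctx, M):
    """A' : attributes shared by all objects in objs."""
    return frozenset(m for m in M if all(m in ctx[g] for g in objs))

def down(attrs, ctx):
    """B' : objects that have all attributes in attrs."""
    return frozenset(g for g, row in ctx.items() if attrs <= row)

def concepts(ctx, M):
    """All (extent, intent) pairs, found by closing every set of objects."""
    objects = list(ctx)
    found = set()
    for r in range(len(objects) + 1):
        for objs in combinations(objects, r):
            intent = up(objs, ctx, M)
            found.add((down(intent, ctx), intent))
    return found

# Hypothetical two-object context: rows map an object to its attribute set.
M = frozenset("abc")
ctx = {"g1": frozenset("ab"), "g2": frozenset("bc")}

for extent, intent in sorted(concepts(ctx, M), key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

The enumeration is exponential in the number of objects; it is meant only to show how extents and intents determine each other.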

The set of join-irreducible elements of a concept lattice $\mathcal{L}(G, M, I)$ is contained in the set
of object concepts, which have the form $(g'', g')$, $g \in G$. Dually, the set of meet-irreducible
elements of a concept lattice is contained in the set of attribute concepts, which have the form
$(m', m'')$, $m \in M$. An object $g$ is called reducible if $g' = M$ or $\exists X \subseteq G \setminus \{g\} : g' = \bigcap_{j \in X} j'$,
i.e., the respective row of the context cross-table is either full or is an intersection of some
other rows. If $g$ is not reducible, then $(g'', g')$ is a join-irreducible element of $\mathcal{L}(G, M, I)$.
Dually, an attribute $m$ is called reducible if $m' = G$ or $\exists Y \subseteq M \setminus \{m\} : m' = \bigcap_{j \in Y} j'$, i.e., the
respective column of the context cross-table is either full or is an intersection of some other
columns. If $m$ is not reducible, then $(m', m'')$ is a meet-irreducible element of $\mathcal{L}(G, M, I)$.

The Basic Theorem of FCA m implies that every finite lattice $(L, \vee, \wedge)$ can be represented
as a concept lattice $\mathcal{L}(J(L), M(L), \le)$, where $J(L)$ is the set of all join-irreducible
elements of $L$, $M(L)$ is the set of meet-irreducible elements of $L$, and $\le$ is the natural partial
order of $(L, \vee, \wedge)$.




A set of attributes $B$ is implied by a set of attributes $A$, or the implication $A \to B$ holds,
if all objects from $G$ that have all attributes from $A$ also have all attributes from $B$, i.e.,
$A' \subseteq B'$. Implications obey the Armstrong rules

$$\frac{}{X \to X}, \qquad \frac{X \to Y}{X \cup Z \to Y}, \qquad \frac{X \to Y, \; Y \cup Z \to W}{X \cup Z \to W},$$

and a minimal subset of implications from which all other implications can be deduced
by means of the Armstrong rules is called an implication base. In [2] a characterization of the
cardinality-minimum implication base (Duquenne-Guigues base) was given.
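
Validity of an implication in a context reduces to the inclusion $A' \subseteq B'$, which is straightforward to test. The sketch below is only an illustration on a hypothetical three-object context; the names `extent` and `holds` are assumptions.

```python
def extent(attrs, ctx):
    """Objects that have all attributes in attrs (the set attrs')."""
    return {g for g, row in ctx.items() if attrs <= row}

def holds(premise, conclusion, ctx):
    """The implication premise -> conclusion holds iff premise' is a subset of conclusion'."""
    return extent(premise, ctx) <= extent(conclusion, ctx)

# Hypothetical context: every object with attribute a also has attribute b.
ctx = {"g1": {"a", "b"}, "g2": {"b", "c"}, "g3": {"a", "b", "c"}}

print(holds({"a"}, {"b"}, ctx))        # a -> b
print(holds({"b"}, {"a"}, ctx))        # b -> a
print(holds({"a", "c"}, {"b"}, ctx))   # augmentation of a -> b by c
```

The third call illustrates the second Armstrong rule: enlarging the premise of a valid implication keeps it valid.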


3 Enumeration of minimal hypotheses 

Now we present a learning model from [H [9] in terms of FCA [T6l [T3l [T7]. This model
complies with the common paradigm of learning from positive and negative examples (see,
e.g. na, ini ): given positive and negative examples of a “target attribute”, construct a
generalization of the positive examples that would not cover any negative example.

Let $t$ be the target attribute, different from the attributes in the set $M$, which correspond
to structural attributes of objects. For example, in pharmacological applications the structural
attributes can correspond to particular subgraphs of molecular graphs of chemical
compounds.

Input data for learning can be represented by sets of positive, negative, and undetermined
examples. Positive examples (or (+)-examples) are objects that are known to have the target
attribute $t$ and negative examples (or (−)-examples) are objects that are known not to have
this attribute.

Definition 3.1. Consider a positive context $\mathbb{K}_+ = (G_+, M, I_+)$ and a negative context $\mathbb{K}_- = (G_-, M, I_-)$.
The context $\mathbb{K}_\pm = (G_+ \cup G_-, M \cup \{t\}, I_+ \cup I_- \cup G_+ \times \{t\})$ is called a training
context. The derivation operators in these contexts are denoted by superscripts $(\cdot)^+$, $(\cdot)^-$,
and $(\cdot)^\pm$, respectively.

Definition 3.2. A subset $H \subseteq M$ is called a positive (or (+)-) hypothesis of the training context $\mathbb{K}_\pm$
if $H$ is an intent of $\mathbb{K}_+$ and $H$ is not a subset of any intent of $\mathbb{K}_-$. For $k \in \mathbb{N} \cup \{0\}$ a
subset $H \subseteq M$ is called a $k$-weak positive (or $k$(+)-) hypothesis of the training context $\mathbb{K}_\pm$ if $H$
is an intent of $\mathbb{K}_+$ and $|H^\pm \cap G_-| \le k$.

Obviously, a positive hypothesis is a 0-weak hypothesis. Weak hypotheses stand for noise-tolerant
dependencies, which are important in data mining applications. Negative (or (−)-) hypotheses
are defined in the same way.

Besides classified objects (positive and negative examples), one usually has objects for
which the value of the target attribute is unknown. These examples are usually called
undetermined examples; they can be given by a context $\mathbb{K}_\tau := (G_\tau, M, I_\tau)$, where the corresponding
derivation operator is denoted by $(\cdot)^\tau$.

Hypotheses can be used to classify the undetermined examples: if the intent

$g^\tau := \{m \in M \mid (g, m) \in I_\tau\}$

of an object $g \in G_\tau$ contains a positive, but no negative hypothesis, then $g^\tau$ is classified positively.
Negative classifications are defined similarly. If $g^\tau$ contains hypotheses of both kinds,
or if $g^\tau$ contains no hypothesis at all, then the classification is contradictory or undetermined,
respectively. In this case one can apply probabilistic techniques.

In [13], [1^ it was argued that one can restrict to minimal (w.r.t. inclusion $\subseteq$) hypotheses,
positive as well as negative, since an object intent $g^\tau$ obviously contains a positive hypothesis
if and only if it contains a minimal positive hypothesis.

Definition 3.3. For $k \in \mathbb{N} \cup \{0\}$, if the set of $k$(+)-hypotheses is not empty, then $H$ is
a minimal $k$(+)-hypothesis iff $H$ is a $k$(+)-hypothesis and $F$ is not a $k$(+)-hypothesis for
any $F \subset H$. In case the set of $k$(+)-hypotheses is empty, we let the set of minimal $k$(+)-hypotheses
consist of the only set $M$.

The latter condition is needed technically for dualization: without it not every monotone
Boolean function would be dualizable.

Example. Consider the following training context, where $m_0$ is the target attribute, the
set of attributes is $M = \{m_1, \dots, m_6\}$, the set of negative examples is $G_- = \{g_1, g_2, g_3\}$, the
set of positive examples is $G_+ = \{g_4, \dots, g_9\}$, and the incidence relation $I$ is given by the
following cross-table:

G\M | m0 | m1 | m2 | m3 | m4 | m5 | m6
g1  |    |    | ×  | ×  |    | ×  | ×
g2  |    | ×  |    | ×  | ×  |    | ×
g3  |    | ×  | ×  |    | ×  | ×  |
g4  | ×  |    | ×  | ×  | ×  | ×  | ×
g5  | ×  | ×  |    | ×  | ×  | ×  | ×
g6  | ×  | ×  | ×  |    | ×  | ×  | ×
g7  | ×  | ×  | ×  | ×  |    | ×  | ×
g8  | ×  | ×  | ×  | ×  | ×  |    | ×
g9  | ×  | ×  | ×  | ×  | ×  | ×  |

Here, we have $2^3 = 8$ minimal hypotheses: $\{m_1, m_2, m_3\}$, $\{m_1, m_2, m_6\}$, $\{m_1, m_5, m_3\}$,
$\{m_1, m_5, m_6\}$, $\{m_4, m_2, m_3\}$, $\{m_4, m_2, m_6\}$, $\{m_4, m_5, m_3\}$, $\{m_4, m_5, m_6\}$.
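
The example can be verified mechanically. The sketch below is a brute-force illustration, not the enumeration algorithm discussed in this paper: it encodes the cross-table as read here (negative examples $g_1, g_2, g_3$ missing $\{m_1, m_4\}$, $\{m_2, m_5\}$, $\{m_3, m_6\}$ respectively, and each positive example $g_j$ missing exactly $m_{j-3}$), computes all intents of the positive context as intersections of object intents, and filters the minimal hypotheses. The helper names are assumptions.

```python
from itertools import combinations

M = frozenset(range(1, 7))                      # m1..m6 (target m0 left out)
POS = {g: M - {g - 3} for g in range(4, 10)}    # g4..g9: g_j lacks m_(j-3)
NEG = {1: M - {1, 4}, 2: M - {2, 5}, 3: M - {3, 6}}

def intents(ctx, M):
    """All intents: M itself plus every intersection of object intents."""
    rows = list(ctx.values())
    found = {frozenset(M)}
    for r in range(1, len(rows) + 1):
        for combo in combinations(rows, r):
            found.add(frozenset.intersection(*combo))
    return found

def minimal_hypotheses(pos, neg, M):
    """Inclusion-minimal intents of pos not contained in any intent of neg."""
    hyps = [h for h in intents(pos, M)
            if not any(h <= n for n in neg.values())]
    if not hyps:                      # convention of Definition 3.3
        return [frozenset(M)]
    return [h for h in hyps if not any(f < h for f in hyps)]

result = minimal_hypotheses(POS, NEG, M)
print(len(result))
for h in sorted(result, key=sorted):
    print(sorted(h))
```

Each minimal hypothesis picks one attribute from each of the pairs $\{m_1, m_4\}$, $\{m_2, m_5\}$, $\{m_3, m_6\}$, which is where the count $2^3$ comes from.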

In what follows we will also need the following definition from FCA, which is important
in constructing “hard cases” for FCA-related complexity problems.

Definition 3.4. Let $G = \{g_1, \dots, g_n\}$ and $M = \{m_1, \dots, m_n\}$ be sets with the same cardinality.
Then the context $\mathbb{K} = (G, M, I_{\ne})$ is called a contranominal scale, where $I_{\ne} = G \times M \setminus
\{(g_1, m_1), \dots, (g_n, m_n)\}$.

The contranominal scale has the following property, which we will use later: for any
$H \subseteq M$ one has $H'' = H$ and $H' = \{g_i \mid m_i \notin H, 1 \le i \le n\}$.
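
The stated property of the contranominal scale is easy to confirm exhaustively for small $n$. The following sketch is an illustration only; objects and attributes are both represented by indices $0, \dots, n-1$, and the function names are assumptions.

```python
from itertools import combinations

def contranominal(n):
    """Object i has every attribute except attribute i."""
    return {i: frozenset(range(n)) - {i} for i in range(n)}

def extent(H, ctx):
    """H' : objects that have all attributes in H."""
    return frozenset(g for g, row in ctx.items() if H <= row)

def intent(A, ctx, M):
    """A' : attributes shared by all objects in A."""
    return frozenset(m for m in M if all(m in ctx[g] for g in A))

def property_holds(n):
    """Check H'' = H and H' = {g_i : m_i not in H} for every subset H of M."""
    ctx, M = contranominal(n), frozenset(range(n))
    for r in range(n + 1):
        for H in map(frozenset, combinations(M, r)):
            if extent(H, ctx) != M - H:
                return False
            if intent(extent(H, ctx), ctx, M) != H:
                return False
    return True

print(property_holds(5))
```

Since every subset of $M$ is closed, a contranominal scale with $n$ attributes has $2^n$ concepts, which is why it serves as a source of hard instances.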

Here we discuss the algorithmic complexity of enumerating all minimal hypotheses. Note
that there is an obvious algorithm for enumerating all hypotheses (not necessarily minimal)
with polynomial delay mi. This algorithm is an adaptation of an algorithm for computing
the set of all concepts, where the branching condition is changed to include the additional
condition $|H^\pm \cap G_-| \le k$.

Problem: Minimal hypotheses enumeration (MHE)

INPUT: Positive and negative contexts $\mathbb{K}_+ = (G_+, M, I_+)$, $\mathbb{K}_- = (G_-, M, I_-)$.

OUTPUT: All minimal hypotheses of $\mathbb{K}_\pm$.

Unfortunately, this problem cannot be solved in output polynomial time unless P = NP.
In order to prove this result we study the complexity of the following decision problem.

Problem: Additional minimal hypothesis (AMH)

INPUT: Positive and negative contexts $\mathbb{K}_+ = (G_+, M, I_+)$, $\mathbb{K}_- = (G_-, M, I_-)$ and a set of
minimal hypotheses $\mathcal{H} = \{H_1, \dots, H_k\}$.

QUESTION: Is there an additional minimal hypothesis $H$ of $\mathbb{K}_\pm$, i.e., a minimal hypothesis $H$
such that $H \notin \mathcal{H}$?

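Algorithm 1 FindNewMinH($\mathbb{K}_+$, $\mathbb{K}_-$, $\mathcal{H}$)

Require: DecideAMH($\mathbb{K}_+$, $\mathbb{K}_-$, $\mathcal{H}$) = true
1: for $g \in G_+$ do
2:     $G^g_+ \Leftarrow \{g^+ \cap h^+ \mid h \in G_+\}$
3:     $I^g_+ \Leftarrow \{(o, m) \mid m \in o, o \in G^g_+\}$
4:     $G^g_- \Leftarrow \{g^+ \cap h^- \mid h \in G_-\}$
5:     $I^g_- \Leftarrow \{(o, m) \mid m \in o, o \in G^g_-\}$
6:     $\mathbb{K}^g_+ \Leftarrow \mathbb{K}(G^g_+, M \cap g^+, I^g_+)$
7:     $\mathbb{K}^g_- \Leftarrow \mathbb{K}(G^g_-, M \cap g^+, I^g_-)$
8:     $\mathcal{H}^g \Leftarrow \{H \mid H \subseteq g^+, H \in \mathcal{H}\}$
9:     if DecideAMH($\mathbb{K}^g_+$, $\mathbb{K}^g_-$, $\mathcal{H}^g$) then
10:        return FindNewMinH($\mathbb{K}^g_+$, $\mathbb{K}^g_-$, $\mathcal{H}^g$)
11:    end if
12: end for
13: return $M$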


Lemma 3.1. AMH is in P iff MHE can be solved in output polynomial time.

Proof. ($\Leftarrow$) Assume there is an output polynomial algorithm $A$ that generates all minimal
hypotheses in time $p(|G_+|, |M|, |I_+|, |G_-|, |I_-|, N)$, where $N$ is the number of minimal hypotheses.
Use this algorithm to construct $A'$ that makes the first $p(|G_+|, |M|, |I_+|, |G_-|, |I_-|, k + 1)$
steps of $A$. Clearly, if there are more than $k$ minimal hypotheses, then $A'$ generates $k + 1$
minimal hypotheses, hence we can solve AMH in polynomial time.

($\Rightarrow$) Now suppose there is a function DecideAMH($\mathbb{K}_+$, $\mathbb{K}_-$, $\mathcal{H}$) that solves an AMH problem
instance in time $O(t)$. We can use Algorithm 1 to find an additional minimal hypothesis if
there is one. Clearly, lines 2 to 8 can be computed in time $O((|G_+| + |G_-|)|M|)$.
Also note that the total number of recursive calls cannot be greater than $|M|$. Thus, the time
complexity of Algorithm 1 is $O((|G_+| + |G_-|)|M|t)$. Let us prove the correctness. First
note that since hypotheses are closed in $\mathbb{K}_+$, the additional minimal hypothesis must be a
subset of some $g^+$, $g \in G_+$, or it could be $M$. By definition the context $\mathbb{K}^g_+$ defines exactly
all closed sets of $\mathbb{K}_+$ that are subsets of $g^+$. It remains to note that at the last recursive call of
Algorithm 1, DecideAMH($\mathbb{K}^g_+$, $\mathbb{K}^g_-$, $\mathcal{H}^g$) does not hold for any $g \in G_+$. Thus, the only possible
additional minimal hypothesis that can be returned is $M$. □


Now we prove NP-completeness of AMH through a reduction of the best-known NP-complete
problem, satisfiability of CNF, to AMH.

Problem: CNF satisfiability (SAT)

INPUT: A Boolean CNF formula $f(x_1, \dots, x_n) = C_1 \wedge \dots \wedge C_k$
QUESTION: Is $f$ satisfiable?

Consider an arbitrary CNF instance $C_1, \dots, C_k$ with variables $x_1, \dots, x_n$, where $C_i = (l_{i1} \vee
\dots \vee l_{i r_i})$, $1 \le i \le k$, and $l_{ij} \in \{x_1, \dots, x_n\} \cup \{\neg x_1, \dots, \neg x_n\}$ ($1 \le i \le k$, $1 \le j \le r_i$) are
literals, i.e., variables or their negations. From this instance we construct a positive context
$\mathbb{K}_+ = (G_+, M, I_+)$ and a negative context $\mathbb{K}_- = (G_-, M, I_-)$. Define

$M = \{C_1, \dots, C_k\} \cup \{x_1, \neg x_1, \dots, x_n, \neg x_n\}$

$G_+ = \{g_{x_1}, g_{\neg x_1}, \dots, g_{x_n}, g_{\neg x_n}\} \cup \{g_{C_1}, \dots, g_{C_k}\}$

$G_- = \{g_{l_1}, \dots, g_{l_n}\}$

The incidence relation of the positive context is defined by $I_+ = I_C \cup I_{\ne} \cup I_=$, where

$I_C = \{(g_{x_i}, C_j) \mid x_i \notin C_j, 1 \le i \le n, 1 \le j \le k\}
\cup \{(g_{\neg x_i}, C_j) \mid \neg x_i \notin C_j, 1 \le i \le n, 1 \le j \le k\}$

$I_{\ne} = \{g_{x_1}, g_{\neg x_1}, \dots, g_{x_n}, g_{\neg x_n}\} \times \{x_1, \neg x_1, \dots, x_n, \neg x_n\}
\setminus \{(g_{x_1}, x_1), (g_{\neg x_1}, \neg x_1), \dots, (g_{x_n}, x_n), (g_{\neg x_n}, \neg x_n)\}$

$I_= = \{(g_{C_1}, C_1), \dots, (g_{C_k}, C_k)\}$

that is, for the $i$-th clause, $C_i^+ \cap \{g_{x_1}, g_{\neg x_1}, \dots, g_{x_n}, g_{\neg x_n}\}$ is the set of literals not included in $C_i$, and
$I_{\ne}$ is the relation of a contranominal scale.

The incidence relation of the negative context is given by

$I_- = G_- \times \{x_1, \neg x_1, \dots, x_n, \neg x_n\}
\setminus \{(g_{l_1}, x_1), (g_{l_1}, \neg x_1), \dots, (g_{l_n}, x_n), (g_{l_n}, \neg x_n)\}$

As the set of minimal hypotheses we take $\mathcal{H} = \{\{C_1\}, \{C_2\}, \dots, \{C_k\}\}$. It is easy to see
that $\mathbb{K}_\pm$ with $\mathcal{H}$ is a correct instance of AMH.

If a hypothesis (not necessarily minimal) is not contained in $\mathcal{H}$ we will call it additional.

Proposition 3.2. If $H$ is an additional minimal hypothesis of $\mathbb{K}_\pm$, then
$H \subseteq \{x_1, \neg x_1, \dots, x_n, \neg x_n\}$.

Proof. Suppose $H \not\subseteq \{x_1, \neg x_1, \dots, x_n, \neg x_n\}$; then, since $H$ is not empty, there is some $C_i \in H$,
$1 \le i \le k$. But $\{C_i\}$ is a minimal hypothesis and thus it does not contain any hypothesis.
Hence $H = \{C_i\}$, and this contradicts the fact that $H$ is an additional minimal hypothesis. □

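For any $H \subseteq \{x_1, \neg x_1, \dots, x_n, \neg x_n\}$ that satisfies $\{x_i, \neg x_i\} \cap H \ne \emptyset$ for any $1 \le i \le n$ we
define the truth assignment $\varphi_H$ in a natural way:

$$\varphi_H(x_i) = \begin{cases} \text{true}, & \text{if } x_i \in H; \\ \text{false}, & \text{if } x_i \notin H. \end{cases}$$

In the case $\{x_i, \neg x_i\} \cap H = \emptyset$ for some $1 \le i \le n$, $\varphi_H$ is not defined. We define $\varphi_H(x_i) = \text{true}$
even if $\{x_i, \neg x_i\} \subseteq H$, although in this case it can be defined either way.

Symmetrically, for a truth assignment $\varphi$ define the set $H_\varphi = \{x_i \mid \varphi(x_i) = \text{true}\} \cup \{\neg x_i \mid
\varphi(x_i) = \text{false}\}$.

Below, for $H \subseteq \{x_1, \neg x_1, \dots, x_n, \neg x_n\}$ we will denote the complement of $H$ in $\{x_1, \neg x_1, \dots, x_n, \neg x_n\}$
by $\overline{H}$.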

Proposition 3.3. If a subset $H \subseteq \{x_1, \neg x_1, \dots, x_n, \neg x_n\}$ is not contained in the intent of
any negative example (i.e., $\forall g \in G_-$, $H \not\subseteq g^-$), then $\varphi_H$ is defined. Conversely, for a truth
assignment $\varphi$ the set $H_\varphi$ is not contained in the intent of any negative example.

The proof is straightforward.

The following theorem proves NP-hardness of AMH.

Theorem 3.4. AMH has a solution if and only if SAT has a solution.

Proof. ($\Rightarrow$) Let $H$ be an additional minimal hypothesis of $\mathbb{K}_\pm$. First note that by Proposition 3.2
and Proposition 3.3 the truth assignment $\varphi_{\overline{H}}$ is correctly defined. Since $H$ is
a nonempty concept intent of $\mathbb{K}_+$, Proposition 3.2 together with the fact that $I_{\ne}$ is the
relation of a contranominal scale implies $H^+ = \{g_{x_i} \mid x_i \in \overline{H}\} \cup \{g_{\neg x_i} \mid \neg x_i \in \overline{H}\}$. Now
$H^{++} \cap \{C_1, C_2, \dots, C_k\} = \emptyset$, hence for any $C_i$ ($1 \le i \le k$) there is some $g_l \in H^+$ such that
$g_l \notin C_i^+$. According to the definition of $I_C$ the latter means that the literal $l$ belongs to the clause
$C_i$. Thus $f(\varphi_{\overline{H}}) = \text{true}$.

($\Leftarrow$) Let $\varphi$ be a truth assignment and $f(\varphi) = \text{true}$. Define $H = \overline{H_\varphi}$. Note that $H^+ =
\{g_{x_i} \mid x_i \in H_\varphi\} \cup \{g_{\neg x_i} \mid \neg x_i \in H_\varphi\}$ because $I_{\ne}$ is the relation of a contranominal scale and
$H \cap g_{C_i}^+ = \emptyset$, $1 \le i \le k$. Suppose that $C_i \in H^{++}$ for some $1 \le i \le k$. This is equivalent to
$H^+ \subseteq C_i^+$. Hence, by the definition of $I_C$, there is no literal $l \in H_\varphi$ such that $l \in C_i$. Therefore,
the clause $C_i$ does not hold, and this contradicts the fact that $\varphi$ satisfies the CNF $f$. Thus
$H^{++} = H$ and $H$ is a hypothesis. Since $H$ does not contain any $\{C_i\}$, it must contain an
additional minimal hypothesis. □
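
For small formulas the statement of Theorem 3.4 can be checked exhaustively. The sketch below is a brute-force illustration of the reduction as reconstructed here, not part of the proof: literals are encoded as strings such as `"x1"` and `"-x1"`, and all function and object names (`build_contexts`, `g_l_x1`, and so on) are assumptions made for the illustration.

```python
from itertools import combinations, product

def build_contexts(clauses, variables):
    """Positive and negative contexts of the reduction, as object -> intent maps."""
    lits = [l for x in variables for l in (x, "-" + x)]
    named = {f"C{j + 1}": frozenset(c) for j, c in enumerate(clauses)}
    M = frozenset(named) | frozenset(lits)
    pos = {}
    for l in lits:
        # g_l has every literal except l (contranominal part) and every
        # clause attribute whose clause does not contain l (relation I_C).
        pos["g_" + l] = (frozenset(lits) - {l}) | {
            c for c, body in named.items() if l not in body}
    for c in named:
        pos["g_" + c] = frozenset({c})
    neg = {"g_l_" + x: frozenset(lits) - {x, "-" + x} for x in variables}
    known = {frozenset({c}) for c in named}
    return pos, neg, M, known

def intents(ctx, M):
    rows = list(ctx.values())
    found = {frozenset(M)}
    for r in range(1, len(rows) + 1):
        for combo in combinations(rows, r):
            found.add(frozenset.intersection(*combo))
    return found

def has_additional_minimal_hypothesis(clauses, variables):
    pos, neg, M, known = build_contexts(clauses, variables)
    hyps = [h for h in intents(pos, M) if not any(h <= n for n in neg.values())]
    minimal = [h for h in hyps if not any(f < h for f in hyps)]
    return any(h not in known for h in minimal)

def satisfiable(clauses, variables):
    for bits in product((False, True), repeat=len(variables)):
        val = dict(zip(variables, bits))
        def is_true(l):
            return (not val[l[1:]]) if l.startswith("-") else val[l]
        if all(any(is_true(l) for l in c) for c in clauses):
            return True
    return False

sat_cnf = [{"x1", "x2"}, {"-x1", "x2"}]
unsat_cnf = [{"x1"}, {"-x1"}]
print(has_additional_minimal_hypothesis(sat_cnf, ["x1", "x2"]),
      satisfiable(sat_cnf, ["x1", "x2"]))
print(has_additional_minimal_hypothesis(unsat_cnf, ["x1"]),
      satisfiable(unsat_cnf, ["x1"]))
```

In the satisfiable instance, the assignment $x_1 = x_2 = \text{true}$ gives $H_\varphi = \{x_1, x_2\}$, and its complement $\{\neg x_1, \neg x_2\}$ is the additional minimal hypothesis found by the search.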

Corollary 3.5. MHE cannot be solved in output polynomial time, unless P = NP.

4 Dualizing monotone Boolean functions on lattices 

Let $f$ be a monotone Boolean function on a lattice $\mathcal{L}$. Without loss of generality we can
assume that $\mathcal{L}$ is a concept lattice $\mathcal{L} = \mathcal{L}(G, M, I)$ of the corresponding formal context
$\mathbb{K} = (G, M, I)$. Then $A \subseteq B \Rightarrow f((A, A')) \le f((B, B'))$. It is known that any monotone
Boolean function on a lattice is uniquely given by its minimal 1-values, i.e., by the set
$\mathcal{A} = \{(A, A') \mid (A, A') \in \mathcal{L}, f((A, A')) = 1, f((B, B')) = 0 \ \forall B \subset A\}$. Define the positive
context $\mathbb{K}_+ = \mathbb{K}$. Define the negative context $\mathbb{K}_- = (G_-, M, I_-)$ via its set of object intents:
$G_- = \{g_A \mid (A', A) \in \mathcal{A}\}$ and $g_A^- = A$. In other words, negative examples are precisely the
intents of minimal 1-values of $f$. Clearly, the set of minimal hypotheses of $\mathbb{K}_\pm$ is exactly the set
of maximal 0-values of $f$.

Symmetrically, for given positive and negative contexts $\mathbb{K}_+$ and $\mathbb{K}_-$ define the context
$\mathbb{K}_{+\cup-} = (G_+ \cup G_-, M, I_+ \cup I_-)$. Let $f$ be a monotone Boolean function on the concept lattice of $\mathbb{K}_{+\cup-}$ that is given
by its minimal 1-values $\mathcal{A} = \{(g'', g') \mid g \in G_-\}$ (where $(\cdot)'$ is the derivation operator of $\mathbb{K}_{+\cup-}$). It
is not hard to see that the set of maximal 0-values of $f$ is defined by the set of minimal
hypotheses of $\mathbb{K}_\pm$.

From Corollary 3.5 it follows that the following problem cannot be solved in output
polynomial time unless P = NP:

Problem: Maximal false values enumeration (MFE)

INPUT: A formal context $\mathbb{K}$ and a set of minimal 1-values of a monotone Boolean function $f$
on the concept lattice of $\mathbb{K}$.

OUTPUT: The set of maximal 0-values of $f$.

Lemma 3.1 also implies that the dualization problem on a lattice given by a formal context
can be solved in output polynomial time iff the corresponding duality (decision version of
dualization) problem can be solved in polynomial time.

Note that in the case of a Boolean lattice the MFE problem is polynomially equivalent to
Monotone Boolean Dualization, and maximal 0-values in this case can be enumerated in
quasi-polynomial time $N^{o(\log N)}$, where $N$ is |input size| + |output size| (see [10]).

In database theory the closure of a set of attributes $A$ is defined by means of iterated
applications of functional dependencies with premises contained in $A$. The same type of closure,
by means of implications instead of functional dependencies, is known in FCA. More precisely,
applying $\mathrm{imp}(A) = A \cup \bigcup \{B \mid D \to B, D \subseteq A\}$ iteratively to $A$ by putting at each next step
$A := \mathrm{imp}(A)$ until saturation, one obtains the implicational closure of $A$, which is equal

to $A''$ in]. So, the set of all implications of a context defines the closure operator $(\cdot)''$ and hence the
closed subsets of attributes, which together with the respective closed subsets of objects
(extents) give the concept lattice. Hence, instead of defining a lattice by the ordered set
of its irreducible elements, one can define it in terms of the set of all valid implications of
the respective formal context, or, equivalently, by its implication base. This consideration
poses another setting of the dualization problem, where the lattice, instead of being given by the set of
positive examples $G_+$, is given by its implications or implication base, and one has to dualize
the monotone function given by the set of examples $G_-$. When the lattice is Boolean, its
implication base is empty mi, so one has to dualize the set of examples $G_-$, which can be
considered as a monotone DNF, where the disjunction goes over objects (elements of $G_-$), which
themselves can be taken as conjunctions of the respective attributes. When the lattice is
distributive, its minimum implication base has one-element premises [TT] (hence, the number
of implications in the base is not larger than $|M|$), so it can easily be computed from the
context in polynomial time, and vice versa. Therefore, dualization on lattices given by
implication bases is, for distributive lattices, polynomially equivalent to dualization on
lattices given by contexts (ordered sets of irreducible elements), which we study in the next
section. The study of dualization problems for lattices given by implication bases is motivated
by simple linear-time reciprocal translations of implications to functional dependencies [IB]
and propositional Horn theories [1].
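
The iterated application of implications until saturation can be sketched as follows. This is the naive fixed-point loop described above, shown for illustration; the function name `imp_closure` and the example implications are hypothetical, and faster closure procedures exist but are not shown.

```python
def imp_closure(attrs, implications):
    """Apply implications (premise, conclusion) to attrs until nothing changes."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            # Fire an implication whose premise is already inside the set.
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return frozenset(closed)

# Hypothetical implication set: a -> b and {b, c} -> d.
implications = [(frozenset("a"), frozenset("b")),
                (frozenset("bc"), frozenset("d"))]

print(sorted(imp_closure({"a"}, implications)))
print(sorted(imp_closure({"a", "c"}, implications)))
```

Each pass either adds at least one attribute or stops, so the loop makes at most $|M|$ productive passes over the implication set.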

In [15] it has been proven that the following problem is NP-hard:

Problem: Incremental maximal model (IMF)

INPUT: Horn theory $\Phi$ and a set of its maximal models $S$.

QUESTION: Is there another maximal model of $\Phi$ not contained in $S$?

In terms of FCA a Horn theory corresponds to a set of implications $\mathcal{J}$ and maximal
models correspond to inclusion-maximal closed sets of $\mathcal{J}$, or object intents, that are not $M$.
In the dualization setting maximal closed sets are dual to the singleton set $\{M\}$. Hence for
the

Problem: Minimal true values enumeration, on lattice given by implication base (MTEIB)
INPUT: A lattice $\mathcal{L}(\mathcal{J})$ given by an implication base $\mathcal{J}$ and a set of maximal 0-values of a
monotone Boolean function $f$ on the lattice
OUTPUT: The set of minimal 1-values of $f$.

we have the following

Corollary 4.1. A solution of MTEIB is impossible in output polynomial time unless P =
NP.

5 Dualization and minimum implication bases

In this section we give complexity upper bounds for some important special cases of monotone
Boolean dualization on lattices in terms of the complexity of the minimum implication base
problem (i.e., minimum Horn theory).

Problem: Minimum implication base recognition (MIBR)

INPUT: Formal context $\mathbb{K} = (G, M, I)$, set of implications $\mathcal{J}$.

QUESTION: Is $\mathcal{J}$ an implication base of $\mathbb{K}$?

The complexity of the MIBR problem is a long-standing open problem. The only known
complexity result is that MIBR is at least as hard as monotone Boolean duality m^.

As we have shown, monotone Boolean duality on a lattice given by a formal context is
coNP-complete. It turns out that if we additionally have an implication base as input, then
the problem does not get harder than MIBR.

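Problem: Duality over lattices given by formal context and implication base (DCI)
INPUT: Formal context $\mathbb{K} = (G, M, I)$, antichains $\mathcal{A}, \mathcal{B} \subseteq \mathcal{L}(\mathbb{K})$ satisfying (*), implication
base $\mathcal{J}$ of $\mathcal{L}(\mathbb{K})$.

QUESTION: Are $\mathcal{A}$ and $\mathcal{B}$ dual on $\mathcal{L}(\mathbb{K})$?

Note that $\mathcal{J}$ could be any implication base of $\mathbb{K}$, not necessarily a minimum one. Now
we describe a polynomial (Karp) reduction of DCI to MIBR. Let us define a context $\mathbb{K}_\mathcal{B} =
(G_\mathcal{B}, M, I_\mathcal{B})$, where $G_\mathcal{B} = \{g_B \mid g \in G, B \in \mathcal{B}\}$ ($|G_\mathcal{B}| = |G| \times |\mathcal{B}|$), and the relation $I_\mathcal{B}$ is
defined via object intents $g_B' = g' \cap B$ for any $g_B \in G_\mathcal{B}$. Obviously, a set $X \subseteq M$ is
closed in $\mathbb{K}_\mathcal{B}$ iff $X$ is closed in $\mathbb{K}$ and there is $B \in \mathcal{B}$ such that $X \subseteq B$. Define the implication base
$\mathcal{J}_\mathcal{A} = \mathcal{J} \cup \{A \to M \mid A \in \mathcal{A}\}$. Clearly, a set $X$ is closed (satisfied) in $\mathcal{J}_\mathcal{A}$ iff $X = M$ or $X$ is
closed in $\mathbb{K}$ and $A \not\subseteq X$ for any $A \in \mathcal{A}$. Thus $\mathcal{A}$ and $\mathcal{B}$ are dual on $\mathcal{L}(\mathbb{K})$ iff $\mathcal{J}_\mathcal{A}$ is an implication
base of $\mathbb{K}_\mathcal{B}$. We have proven:

Lemma 5.1. MIBR is DCI-hard (under polynomial Karp reduction).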

In [HI [7] the problem of dualization over a product of lattices was considered. For the
case of semi-lattices of bounded width Elbassioni has shown that the duality problem can
be solved in quasi-polynomial time. Nevertheless, in the case of a product of general lattices the
existence of a quasi-polynomial algorithm is still an open problem. Here we prove that this
problem is not harder than MIBR.

Problem: Duality over product of lattices (DPL)

INPUT: Product of lattices $\mathcal{L} = \mathcal{L}_1 \times \dots \times \mathcal{L}_k$ given by $\mathcal{L}_1, \dots, \mathcal{L}_k$, antichains $\mathcal{A}, \mathcal{B} \subseteq \mathcal{L}$
satisfying (*).

QUESTION: Are $\mathcal{A}$ and $\mathcal{B}$ dual over $\mathcal{L}$?

Proposition 5.2. MIBR is DPL-hard (under polynomial Karp reduction).

Proof. First note that given a lattice $\mathcal{L}_i$ (e.g., as a whole relation matrix) we can find all join-irreducible
and meet-irreducible elements of $\mathcal{L}_i$ in $poly(|\mathcal{L}_i|)$ time. Thus it is possible to get a
context $\mathbb{K}_{\mathcal{L}_i} = (G_i, M_i, I_i)$ that defines the lattice $\mathcal{L}_i$ in polynomial time. In order to construct
a formal context $\mathbb{K}_\mathcal{L} = (G, M, I)$ of the product of lattices $\mathcal{L}$, we define $G = G_1 \cup \dots \cup G_k$,
$M = M_1 \cup \dots \cup M_k$, and the relation $I$ as follows. Without loss of generality let $g \in G_i$ and $m \in M_j$; then
$gIm$ iff $i \ne j$ or $g I_i m$. It is straightforward to check that $\mathcal{L}(\mathbb{K}_\mathcal{L})$ is isomorphic to $\mathcal{L}$.
In [20] (Lemmas A.2 and A.3) it was proven that (in FCA terms) given a formal
context IK = {G, M, I) one can compute its cardinality-minimum implication base J in 
0(|Mp|£(]K)p) time. Moreover, such a J contains at most \M\‘^\C{^\ implications. Thus 
for a given lattice Ci we can hnd implication base of size 0{poly{\Ci\) in time 0{poly{\Ci\)). 
Clearly, J7i U ... U J/it is an implication base of IK/;. The proposition statement follows from 
Lemma 15.11 □ 
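
The context construction used in the proof above can be illustrated by a small sketch. It is not taken from the paper: a context is assumed to be a triple (objects, attributes, set of incidence pairs), the function names `product_context` and `count_concepts` are hypothetical, and elements are tagged with the index of their factor to make the unions disjoint.

```python
def product_context(contexts):
    """Context of L_1 x ... x L_k from contexts (G_i, M_i, I_i) of the factors."""
    # Tag every object/attribute with its factor index so the unions are disjoint.
    G = [(i, g) for i, (Gi, _, _) in enumerate(contexts) for g in Gi]
    M = [(j, m) for j, (_, Mj, _) in enumerate(contexts) for m in Mj]
    # g I m  iff  the factors differ, or g I_i m inside the common factor.
    I = {
        ((i, g), (j, m))
        for (i, g) in G
        for (j, m) in M
        if i != j or (g, m) in contexts[i][2]
    }
    return G, M, I


def count_concepts(G, M, I):
    """Number of concepts = number of intents (M and all intersections of rows)."""
    intents = {frozenset(M)}
    for g in G:
        row = frozenset(m for m in M if (g, m) in I)
        intents |= {row & y for y in intents}
    return len(intents)


# A two-element chain has the context with one object, one attribute, empty relation.
CHAIN2 = (["g"], ["m"], set())
```

For this example the product of two (three) copies of the two-element chain should yield the Boolean lattice with 4 (8) elements, which can be checked with `count_concepts(*product_context([CHAIN2, CHAIN2]))`.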

Another interesting special case of lattices for which we can establish a similar complexity
bound is the case of distributive lattices. It is known that for a given context $\mathbb{K}$ of a
distributive lattice, the minimum implication base of $\mathbb{K}$ has size polynomial in $|\mathbb{K}|$ and can
be found in polynomial time (HU)- Thus MIBR is in P for a distributive lattice. The 
following corollary follows directly from this fact and Lemma 5.1.

Corollary 5.3. The problem of dualization on a distributive lattice (given a formal context $\mathbb{K}$ of a
distributive lattice and antichains $\mathcal{A}, \mathcal{B} \subseteq \mathcal{L}(\mathbb{K})$ satisfying (*), decide whether $\mathcal{A}$ and $\mathcal{B}$ are
dual) is not harder than MIBR (under polynomial Karp-reduction).

6 Dualization over distributive lattices 

We assume that a distributive lattice is represented as a lattice $\mathcal{L}(\mathcal{P})$ of downsets (order
ideals) of a poset $\mathcal{P}$, and the poset $\mathcal{P}$ is given by an $n \times n$ matrix. It is well known that any
distributive lattice has such a representation [Diiain]- Note that one can use formal context 
representation of the distributive lattice as well, since the size of the corresponding formal 
context (P, P, <) is polynomial in n, and our dualization algorithm is subexponential. 

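We treat the elements of $\mathcal{L} = \mathcal{L}(\mathcal{P})$ as subsets of $\mathcal{P}$ (since they are downsets of $\mathcal{P}$), so for
two downsets $A, B \in \mathcal{L}(\mathcal{P})$, $A \le B$ means that $A \subseteq B$. For an element $p \in \mathcal{P}$, the smallest
(by set inclusion) downset that contains $p$ is denoted by ${\downarrow}p$, and the smallest upperset (order
filter) that contains $p$ is denoted by ${\uparrow}p$. More generally, for any subset $X \subseteq \mathcal{P}$, by ${\downarrow}X$ we
denote the smallest downset that contains $X$, i.e. ${\downarrow}X = \bigcup_{p \in X} {\downarrow}p$.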
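
As an illustration of this notation, here is a minimal sketch assuming the poset is stored as a Boolean matrix `leq` with `leq[q][p]` true iff $q \le p$ (the matrix name, the helper names and the example poset are assumptions made for this sketch, not part of the paper).

```python
from typing import FrozenSet, Iterable, List

Matrix = List[List[bool]]  # leq[q][p] is True iff q <= p (reflexive, transitive)


def down(leq: Matrix, p: int) -> FrozenSet[int]:
    """Principal downset of p: all q with q <= p."""
    return frozenset(q for q in range(len(leq)) if leq[q][p])


def up(leq: Matrix, p: int) -> FrozenSet[int]:
    """Principal upperset of p: all q with p <= q."""
    return frozenset(q for q in range(len(leq)) if leq[p][q])


def down_closure(leq: Matrix, xs: Iterable[int]) -> FrozenSet[int]:
    """Smallest downset containing xs: the union of the principal downsets."""
    result = set()
    for p in xs:
        result |= down(leq, p)
    return frozenset(result)


# Example poset on {0, 1, 2} with 0 < 1 and 0 < 2 (1 and 2 incomparable).
LEQ = [
    [True, True, True],
    [False, True, False],
    [False, False, True],
]
```

In this example `down(LEQ, 1)` is `{0, 1}`, `up(LEQ, 0)` is the whole poset, and `down_closure(LEQ, [1, 2])` is also the whole poset.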

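Let $\mathcal{A}$ and $\mathcal{B}$ be antichains of a distributive lattice $\mathcal{L}(\mathcal{P})$. Further on we will call a triple
of the form $((\mathcal{A}, \mathcal{B}), \mathcal{P})$ a dualization problem input. Note that in the degenerate cases where
$\mathcal{A} = \emptyset$ or $\mathcal{B} = \emptyset$ the duality can easily be tested in polynomial time. If $\mathcal{A}$ is empty, then $\mathcal{B}$
is dual to $\mathcal{A}$ iff $\mathcal{B} = \{\mathcal{P}\}$. If $\mathcal{B}$ is empty, then $\mathcal{A}$ is dual to $\mathcal{B}$ iff $\mathcal{A} = \{\emptyset\}$. Let us call the
algorithm that tests duality in these two degenerate cases EasyTest$((\mathcal{A}, \mathcal{B}), \mathcal{P})$.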

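We will also use the notion of frequency of an element $p \in \mathcal{P}$. Let $\mathcal{C}$ be some set of subsets
of $\mathcal{P}$ (i.e. $\mathcal{C} \subseteq 2^{\mathcal{P}}$), then the frequency of $p$ in $\mathcal{C}$ is the fraction of elements of $\mathcal{C}$ that contain $p$:

Definition 6.1. $freq_{\mathcal{C}}(p) = |\{C \in \mathcal{C} \mid p \in C\}| / |\mathcal{C}|$.

Let us denote $\overline{\mathcal{C}} = \{\mathcal{P} \setminus C \mid C \in \mathcal{C}\}$, thus by definition $freq_{\overline{\mathcal{C}}}(p) = |\{C \in \mathcal{C} \mid p \notin C\}| / |\mathcal{C}|$.
For convenience we define the quantities $N = |\mathcal{A}| + |\mathcal{B}|$, and $m = \max_{p \in \mathcal{P}} (|{\downarrow}p| + |{\uparrow}p|)$
(note that $m \ge 2$).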
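
The quantities just defined are easy to compute directly. The following is a small sketch under the assumption that families are lists of frozensets and the poset is a Boolean matrix `leq` with `leq[q][p]` true iff $q \le p$; the function names are hypothetical.

```python
from fractions import Fraction
from typing import FrozenSet, List


def freq(family: List[FrozenSet[int]], p: int) -> Fraction:
    """freq_C(p): fraction of members of the family that contain p."""
    return Fraction(sum(1 for c in family if p in c), len(family))


def freq_complement(family: List[FrozenSet[int]], p: int) -> Fraction:
    """Frequency of p in the complemented family: fraction of members missing p."""
    return 1 - freq(family, p)


def width_parameter(leq: List[List[bool]]) -> int:
    """m = max over p of |down(p)| + |up(p)|."""
    n = len(leq)
    return max(
        sum(leq[q][p] for q in range(n)) + sum(leq[p][q] for q in range(n))
        for p in range(n)
    )


# Example poset on {0, 1, 2} with 0 < 1 and 0 < 2.
LEQ = [
    [True, True, True],
    [False, True, False],
    [False, False, True],
]
FAMILY = [frozenset({0}), frozenset({0, 1})]
```

Exact fractions are used here only to avoid floating-point comparisons against thresholds.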
6.1 Algorithm

Here we describe a subexponential algorithm for testing duality on a distributive lattice.
The structure of the algorithm is close to that in [10]. The algorithm decomposes the initial
problem instance into smaller instances and solves them recursively. In order to keep the
total number of recursive calls subexponential, at each decomposition step the algorithm
tries to select an element of $\mathcal{P}$ such that either it is frequent or it has a large number of
successors or predecessors.


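Algorithm 2 TestDuality$((\mathcal{A}, \mathcal{B}), \mathcal{P})$

Require: $\mathcal{A}, \mathcal{B} \subseteq \mathcal{L}(\mathcal{P})$

 1: if $\mathcal{A} = \emptyset$ or $\mathcal{B} = \emptyset$ then
 2:     return EasyTest$((\mathcal{A}, \mathcal{B}), \mathcal{P})$
 3: end if
 4: $n \leftarrow |\mathcal{P}|$
 5: $m \leftarrow \max_{p \in \mathcal{P}} (|{\downarrow}p| + |{\uparrow}p|)$
 6: $N \leftarrow |\mathcal{A}| + |\mathcal{B}|$
 7: if $m > n^{1/3}$ then
 8:     $p \leftarrow \arg\max_{p \in \mathcal{P}} (|{\downarrow}p| + |{\uparrow}p|)$
 9: else
10:     if $\max_{p \in \mathcal{P}} freq_{\mathcal{A}}(p) < \frac{1}{m \log_{4/3} N}$ and $\max_{p \in \mathcal{P}} freq_{\overline{\mathcal{B}}}(p) < \frac{1}{m^2 \log_{4/3} N}$ then
11:         return false
12:     end if
13:     $p \leftarrow \arg\max_{p \in \mathcal{P}} (\max(freq_{\mathcal{A}}(p), freq_{\overline{\mathcal{B}}}(p)))$
14: end if
15: return TestDuality$((\mathcal{A}_1^p, \mathcal{B}_1^p), \mathcal{P} \setminus {\downarrow}p)$ $\wedge$ TestDuality$((\mathcal{A}_2^p, \mathcal{B}_2^p), \mathcal{P} \setminus {\uparrow}p)$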


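To describe the decomposition performed by our algorithm we define the following four sets:

$\mathcal{A}_1^p = \{A \setminus {\downarrow}p \mid A \in \mathcal{A}\}$, $\mathcal{B}_1^p = \{B \setminus {\downarrow}p \mid p \in B, B \in \mathcal{B}\}$,

$\mathcal{A}_2^p = \{A \mid p \notin A, A \in \mathcal{A}\}$, $\mathcal{B}_2^p = \{B \setminus {\uparrow}p \mid B \in \mathcal{B}\}$.

Note that $\mathcal{B}_1^p = \{B \setminus {\downarrow}p \mid {\downarrow}p \subseteq B, B \in \mathcal{B}\}$, and $\mathcal{A}_2^p = \{A \mid {\uparrow}p \cap A = \emptyset, A \in \mathcal{A}\}$.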

The following lemma proves the correctness of Algorithm 2.

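Lemma 6.1. For any $p \in \mathcal{P}$, $\mathcal{A}$ and $\mathcal{B}$ are dual iff the following two conditions hold:

$\mathcal{A}_1^p$ and $\mathcal{B}_1^p$ are dual on $\mathcal{L}(\mathcal{P} \setminus {\downarrow}p)$,

$\mathcal{A}_2^p$ and $\mathcal{B}_2^p$ are dual on $\mathcal{L}(\mathcal{P} \setminus {\uparrow}p)$.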

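Proof. ($\Leftarrow$) Let us fix an arbitrary $X \in \mathcal{L}$. Consider two possible cases: $p \in X$ and $p \notin X$. If
$p \in X$ then, since $\mathcal{A}_1^p$ and $\mathcal{B}_1^p$ are dual, either $A_1 \subseteq X \setminus {\downarrow}p$ for some $A_1 \in \mathcal{A}_1^p$, or $X \setminus {\downarrow}p \subseteq B_1$
for some $B_1 \in \mathcal{B}_1^p$. Clearly, $X \setminus {\downarrow}p \subseteq B_1$ implies $X \subseteq B_1 \cup {\downarrow}p \in \mathcal{B}$. On the other hand,
$A_1 \in \mathcal{A}_1^p$ implies that there is $A \in \mathcal{A}$ such that $A_1 = A \setminus {\downarrow}p$, and hence $A \subseteq X$ (since
${\downarrow}p \subseteq X$).

If $p \notin X$ then, since $\mathcal{A}_2^p$ and $\mathcal{B}_2^p$ are dual, either $A_2 \subseteq X \setminus {\uparrow}p$ for some $A_2 \in \mathcal{A}_2^p$, or
$X \setminus {\uparrow}p \subseteq B_2$ for some $B_2 \in \mathcal{B}_2^p$. By definition, $B_2 \in \mathcal{B}_2^p$ implies that there is $B \in \mathcal{B}$ such that
$B_2 = B \setminus {\uparrow}p$. Note that $A_2 \in \mathcal{A}$, and $X = X \setminus {\uparrow}p \subseteq B_2 \subseteq B$.

($\Rightarrow$) Let us prove that $\mathcal{A}_1^p$ and $\mathcal{B}_1^p$ are dual. Consider an arbitrary $X \in \mathcal{L}(\mathcal{P} \setminus {\downarrow}p)$. Because
$\mathcal{A}$ and $\mathcal{B}$ are dual on $\mathcal{L}(\mathcal{P})$, either $A \subseteq X \cup {\downarrow}p$ for some $A \in \mathcal{A}$, or $X \cup {\downarrow}p \subseteq B$ for some
$B \in \mathcal{B}$. If $A \subseteq X \cup {\downarrow}p$ then $A \setminus {\downarrow}p \subseteq X$ (since ${\downarrow}p \cap X = \emptyset$). If $X \cup {\downarrow}p \subseteq B$ then $X \subseteq B \setminus {\downarrow}p$,
and by definition $B \setminus {\downarrow}p \in \mathcal{B}_1^p$. It is easy to check that $(\mathcal{A}_1^p, \mathcal{B}_1^p)$ has property (*).

Now we prove that $\mathcal{A}_2^p$ and $\mathcal{B}_2^p$ are dual. Consider an arbitrary $X \in \mathcal{L}(\mathcal{P} \setminus {\uparrow}p)$. Note that
$X \in \mathcal{L}(\mathcal{P})$. Because $\mathcal{A}$ and $\mathcal{B}$ are dual on $\mathcal{L}(\mathcal{P})$, either $A \subseteq X$ for some $A \in \mathcal{A}$, or $X \subseteq B$
for some $B \in \mathcal{B}$. If $A \subseteq X$ then $p \notin A$, and $A \in \mathcal{A}_2^p$. If $X \subseteq B$ then $X \subseteq B \setminus {\uparrow}p$ (since
${\uparrow}p \cap X = \emptyset$). It is easy to check that $(\mathcal{A}_2^p, \mathcal{B}_2^p)$ has property (*).

□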


The following lemma helps one to establish a lower bound on the frequency of the most 
frequent element of V. 

Lemma 6.2. If $\mathcal{A}$ and $\mathcal{B}$ are dual then

$\sum_{A \in \mathcal{A}} (3/4)^{|A|/m^2} + \sum_{B \in \mathcal{B}} e^{-(n - |B|)/m} \ge 1$.


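Proof. To prove this bound we use the 'method of expectations' similar to that in [10], but
with a more tricky probability distribution. Suppose we fixed some probability distribution
of $X \in \mathcal{L}$. Let us denote the expected number of $A \in \mathcal{A}$ with $A \subseteq X$ by $E_{\mathcal{A}}$, and the expected
number of $B \in \mathcal{B}$ with $X \subseteq B$ by $E_{\mathcal{B}}$. Antichains $\mathcal{A}$ and $\mathcal{B}$ are dual iff for any $X \in \mathcal{L}$ either
$A \subseteq X$ for some $A \in \mathcal{A}$, or $X \subseteq B$ for some $B \in \mathcal{B}$. Thus if $\mathcal{A}$ and $\mathcal{B}$ are dual, then
$E_{\mathcal{A}} + E_{\mathcal{B}} \ge 1$. By linearity of expectations $E_{\mathcal{A}} = \sum_{A \in \mathcal{A}} E_A$, where $E_A$ is the probability that
$A \subseteq X$. Similarly, $E_{\mathcal{B}} = \sum_{B \in \mathcal{B}} E_B$, where $E_B$ is the probability that $X \subseteq B$. Unlike the
case of a Boolean lattice, no analytical expression for $E_A$ and $E_B$ is known (even the existence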
of a polynomial approximation algorithm is an open question 0), but we can hnd upper 
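bounds for $E_A$, $A \in \mathcal{A}$, and $E_B$, $B \in \mathcal{B}$.

In order to generate a random (but not uniform) element $X \in \mathcal{L}$ we select each $p \in \mathcal{P}$
independently with probability $1/m$. Suppose elements $p_1, p_2, \dots, p_r$ have been selected, then the resulting
downset $X \in \mathcal{L}$ is defined as $X = {\downarrow}p_1 \cup {\downarrow}p_2 \cup \dots \cup {\downarrow}p_r$.

For a given downset $A \in \mathcal{A}$ let us bound the probability that $A \subseteq X$. To each $p \in \mathcal{P}$
we assign the event $I_p$ that $p \notin X$. Note that $\Pr(I_p) \ge (1 - 1/m)^m \ge 1/4$ (since
$m \ge 2$). Consider any maximum-cardinality set $\{a_1, a_2, \dots, a_k\} \subseteq A$ such that the events
$I_{a_1}, I_{a_2}, \dots, I_{a_k}$ are mutually independent. For any $a \in A$, whether the event $I_a$ happens depends only on whether some
$q \ge a$ was selected, hence $I_a$ is independent of all $I_q$ for $q \notin {\downarrow}({\uparrow}a)$. Since $|{\downarrow}({\uparrow}a)| \le m^2$, it
is easy to see that $k \ge |A|/m^2$. Since the event $A \subseteq X$ happens only if none of $I_{a_1}, \dots, I_{a_k}$ happens, we have
$\Pr(A \subseteq X) \le \prod_{i=1}^{k} (1 - \Pr(I_{a_i})) \le (1 - 1/4)^k \le (3/4)^{|A|/m^2}$.

To bound $E_B$, note that for any $B \in \mathcal{B}$, the probability $\Pr(X \subseteq B) = \Pr(X \cap (\mathcal{P} \setminus B) = \emptyset)$.
This probability is exactly $(1 - 1/m)^{|\mathcal{P} \setminus B|} = (1 - 1/m)^{n - |B|} \le e^{-(n - |B|)/m}$. □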
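
The random experiment used in this proof can be checked exactly on a tiny poset. The sketch below is only an illustration under stated assumptions: the poset is a Boolean matrix `leq` (`leq[q][p]` true iff $q \le p$), each element is selected independently with probability $1/m$, and the hypothetical helper `downset_distribution` enumerates all $2^n$ selections, so it is usable only for very small $n$.

```python
from fractions import Fraction
from itertools import product


def downset_distribution(leq, m):
    """Exact distribution of X = union of down(p) over the selected p,
    where each p is selected independently with probability 1/m."""
    n = len(leq)
    q = Fraction(1, m)
    dist = {}
    for mask in product([False, True], repeat=n):
        prob = Fraction(1)
        for chosen in mask:
            prob *= q if chosen else 1 - q
        # i belongs to X iff some selected p lies above i.
        x = frozenset(i for i in range(n)
                      if any(mask[p] and leq[i][p] for p in range(n)))
        dist[x] = dist.get(x, Fraction(0)) + prob
    return dist


def prob_contained_in(dist, b):
    """Pr(X is a subset of b)."""
    return sum(pr for x, pr in dist.items() if x <= b)


def prob_not_in(dist, p):
    """Pr(p is not in X)."""
    return sum(pr for x, pr in dist.items() if p not in x)


# Example poset on {0, 1, 2} with 0 < 1 and 0 < 2; here m = 4 and n = 3.
LEQ = [
    [True, True, True],
    [False, True, False],
    [False, False, True],
]
DIST = downset_distribution(LEQ, 4)
```

For the downset $B = \{0, 2\}$ the value `prob_contained_in(DIST, frozenset({0, 2}))` agrees with $(1 - 1/m)^{n - |B|}$, and `prob_not_in(DIST, p)` stays at or above $1/4$ for every element of this example.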


Corollary 6.3. If $\mathcal{A}$ and $\mathcal{B}$ are dual, then at least one of the following statements is true:

• $\exists p \in \mathcal{P}$ : $freq_{\mathcal{A}}(p) \ge \frac{1}{m \log_{4/3} N}$

• $\exists p \in \mathcal{P}$ : $freq_{\overline{\mathcal{B}}}(p) \ge \frac{1}{m^2 \log_{4/3} N}$


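Proof. Let $k_{\mathcal{A}} = \min_{A \in \mathcal{A}} |A|/m^2$, $k_{\mathcal{B}} = \min_{B \in \mathcal{B}} (n - |B|)/m$, and $k = \min(k_{\mathcal{A}}, k_{\mathcal{B}})$. By
Lemma 6.2, $\sum_{A \in \mathcal{A}} (3/4)^{|A|/m^2} + \sum_{B \in \mathcal{B}} e^{-(n - |B|)/m} \ge 1$. Since $e^{-1} < 3/4$, every term is at most $(3/4)^k$. Hence $N (3/4)^k \ge 1$, which yields
$k \le \log_{4/3} N$. Since $(\mathcal{A}, \mathcal{B})$ has property (*), for any $A \in \mathcal{A}$, $B \in \mathcal{B}$ the intersection $A \cap \overline{B}$
is nonempty. If $|A| = k m^2$, then there is some $a \in A$ such that $freq_{\overline{\mathcal{B}}}(a) \ge 1/(k m^2) \ge
1/(m^2 \log_{4/3} N)$. Similarly, if $|\overline{B}| = k m$, then there is some $b \in \overline{B}$ such that $freq_{\mathcal{A}}(b) \ge
1/(k m) \ge 1/(m \log_{4/3} N)$. □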

Theorem 6.4 (Time complexity of the dualization algorithm). Algorithm 2 decides duality
in time $2^{O(n^{0.67} \log^3 (|\mathcal{A}| + |\mathcal{B}|))}$.


Proof. First note that all lines of Algorithm 2 can be computed in polynomial time (disregarding
recursive calls). In order to bound the number of recursive calls during an execution of
Algorithm 2 we consider the following problem volume quantity: $vol(\mathcal{A}, \mathcal{B}, \mathcal{P}) = |\mathcal{A}| \cdot |\mathcal{B}| \cdot n$.
Dualization problem $(\mathcal{A}, \mathcal{B}, \mathcal{P})$ branches into two subproblems $(\mathcal{A}_1^p, \mathcal{B}_1^p, \mathcal{P} \setminus {\downarrow}p)$ and $(\mathcal{A}_2^p, \mathcal{B}_2^p, \mathcal{P} \setminus {\uparrow}p)$.
Let us denote the volumes of these problems by $vol$, $vol_1$, and $vol_2$, respectively. In
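case of line 13, by Corollary 6.3 either $vol_2 \le (1 - \frac{1}{m \log_{4/3} N}) vol$ or $vol_1 \le (1 - \frac{1}{m^2 \log_{4/3} N}) vol$, where $m \le n^{1/3}$.
Moreover, in case of line 8 of Algorithm 2, $m = |{\downarrow}p| + |{\uparrow}p| > n^{1/3}$, which implies either
$vol_1 \le \frac{n - n^{1/3}/2}{n} \cdot vol = (1 - \frac{1}{2 n^{2/3}}) vol$, or $vol_2 \le (1 - \frac{1}{2 n^{2/3}}) vol$. Thus, we have the following
bound on the number of recursive calls: $A(vol) \le A((1 - \varepsilon) vol) + A(vol - 1) + 1$, where $\varepsilon$ is the smallest of the reduction factors above. In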
[To] it has been proven that solution A{v) of the recurrence A{v) < 1 -|- A((l — £)n) -|- A{v — 
1), A(l) = 1 can be bounded by A{v) < (3 -|- Substituting e = ^rPt^iogN 

A{v) < (3-|-log® □ 


7 Conclusion 

In this paper we have studied the dualization problem on a lattice given by the ordered sets
of its irreducible elements (i.e., as a concept lattice). For this representation, the dualization
problem has complexity different from that in the case of explicit lattice representation as an
ordered set of all its elements. We have shown that the dualization problem for a lattice given
by the ordered set of its irreducible elements (concept lattice) is equivalent to the enumeration
of minimal hypotheses, which is not possible in output polynomial time unless P = NP. For
the case of distributive lattices dualization was shown to be possible in subexponential time.
We have proved that the long-standing open complexity problem of constructing a minimum
implication base (irredundant Horn CNF) is at least as hard as dualization over a distributive
lattice or dualization over the product of explicitly given lattices (open problem stated by
Elbassioni HI)- 

It is still open whether dualization over a distributive lattice can be solved in output
quasi-polynomial time, or whether this problem cannot be solved in output polynomial time unless P =
NP. The complexity of dualization for other important classes of lattices, such as modular lattices,
also remains an open question for the case where the lattice is given by the ordered set of
its irreducible elements.